Linguistically Informed Statistical Models of Constituent Structure for Ordering in Sentence Realization
Abstract
We present several statistical models of syntactic constituent order for sentence realization. The models we compare include simple joint models inspired by existing statistical parsing models and several novel conditional models. The conditional models leverage a large set of linguistic features without manual feature selection. We apply and evaluate the models in sentence realization for French and German and find that a particular conditional model outperforms all others. We employ a version of that model in an evaluation on unordered trees from the Penn Treebank. We offer this result on standard data as a reference point for evaluations of ordering in sentence realization.
Similar resources
Amalgam: A machine-learned generation module
Amalgam is a novel system for sentence realization during natural language generation. Amalgam takes as input a logical form graph, which it transforms through a series of modules involving machine-learned and knowledge-engineered sub-modules into a syntactic representation from which an output sentence is read. Amalgam constrains the search for a fluent sentence realization by following a ling...
Do Heavy-NP Shift Phenomenon and Constituent Ordering in English Cause Sentence Processing Difficulty for EFL Learners?
Heavy-NP shift occurs when speakers prefer placing lengthy or “heavy” noun phrase direct objects in the clause-final position within a sentence rather than in the post-verbal position. Two experiments were conducted in this study, and their results suggested that having a long noun phrase affected the ordering of constituents (the noun phrase and prepositional phrase) by advanced Iranian EFL le...
Minimal Dependency Length in Realization Ranking
Comprehension and corpus studies have found that the tendency to minimize dependency length has a strong influence on constituent ordering choices. In this paper, we investigate dependency length minimization in the context of discriminative realization ranking, focusing on its potential to eliminate egregious ordering errors as well as better match the distributional characteristics of sentenc...
An Overview of Amalgam: A Machine-learned Generation Module
We present an overview of Amalgam, a sentence realization module that combines machine-learned and knowledge-engineered components to produce natural language sentences from logical form inputs. We describe the decomposition of the task of sentence realization into a linguistically informed series of steps, with particular attention to the linguistic issues that arise in German. We report on the...
Extending Phrase-Based Decoding with a Dependency-Based Reordering Model
Phrase-based decoding is conceptually simple and straightforward to implement, at the cost of drastically oversimplified reordering models. Syntactically aware models make it possible to capture linguistically relevant relationships in order to improve word order, but they can be more complex to implement and optimise. In this paper, we explore a new middle ground between phrase-based and synta...
Publication date: 2004